Abstract

We recover the first linear programming bound of McEliece, Rodemich, Rumsey, and 
Welch for binary error-correcting codes and designs via a covering argument. It is possible 
to show, interpreting the following notions appropriately, that if a code has a large distance, 
then its dual has a small covering radius and, therefore, is large. This implies the original 
code to be small. 

We also point out (in conjunction with further work) that this bound is a natural isoperimetric
constant of the Hamming cube, related to its Faber-Krahn minima.

While our approach belongs to the general framework of Delsarte's linear programming 
method, its main technical ingredient is Fourier duality for the Hamming cube. In particular, 
we do not deal directly with Delsarte's linear program or orthogonal polynomial theory. 



1 Introduction 

This paper takes another look at the first linear programming bound on binary error correcting 
codes, or, alternatively, on optimal packing of Hamming balls in a Hamming cube. 

The bound was originally proved by McEliece, Rodemich, Rumsey, and Welch [15], following
Delsarte's linear programming approach [7]. Delsarte showed that the distance distribution of a
binary code satisfies a family of linear constraints whose coefficients can be viewed as values
of a certain family of orthogonal polynomials, namely the Krawtchouk polynomials. This made it
possible to construct a linear programming relaxation of the original combinatorial question,
and to view the obtained linear program as an extremal problem in orthogonal polynomials.
Good feasible solutions of the dual program were constructed in [15] using tools from the theory
of orthogonal polynomials. These solutions lead to the bound known as the first linear
programming bound (or the first JPL bound). This bound is the best known upper bound on
the cardinality of a code with a given minimal distance, for a significant range of distances.

Delsarte's approach extends to a family of finite metric spaces, known as commutative 
association schemes. A Hamming cube is one example of an association scheme. Another 
relevant example is the Hamming sphere. In [15] good feasible solutions to Delsarte's linear
program for the Hamming sphere are constructed. These lead to the best known upper bounds on
constant weight error-correcting codes (ball packing in the Hamming sphere), and, combined
with the Bassalygo-Elias inequality, to the best known upper bound on binary codes. This
bound is known as the second linear programming bound.

We refer to |X5|, [Fl] HI [2] for a detailed exposition of the notions discussed above, including 
error-correcting codes and their significance, packing in metric spaces, association schemes, 
Delsarte's linear program, and orthogonal polynomials. 

The point of view presented in this paper is somewhat different. Our main tool is Fourier
analysis on the group $\mathbb{F}_2^n$, or, equivalently, on the Hamming cube $\{0,1\}^n$. We follow the approach
of Kalai and Linial [10], in which the characteristic function of a binary code is viewed as a
real-valued function on the cube. A study of the Fourier transform of this function and its simple
by-products makes it possible to recover Delsarte's linear program in a form which does not
require treatment of Krawtchouk polynomials.

Moreover, this viewpoint allows easy access to new geometric information. Specifically,
we establish a simple relation between the minimal distance (equivalently, packing radius) of a
code and the essential covering radius of its dual. Recall that r is a covering radius of a subset
C of $\{0,1\}^n$ if the union of Hamming balls of radius r centered at the points of C covers the
whole space. Here we use a somewhat weaker notion, and require this union of balls to cover
only a significant fraction of the space.

This observation, which we consider to be the main contribution of this paper, leads to 
a simple proof of the first linear programming bound. In particular, we do not need to deal 
directly with Delsarte's linear program or orthogonal polynomial theory. 

We move to the principal definitions and to the statement of the main results. 

A binary error-correcting code with block length n and minimal distance d is a subset of the
n-dimensional Hamming cube in which the distance between any two distinct points is at least
d. Let A(n, d) be the maximal size of such a code. In this paper we are interested in the case in
which the distance d is linear in the length n of the code, and we let the length n go to infinity.
In this case A(n, d) is known [11] to grow exponentially in n, and we consider the quantity

$$R(\delta) = \limsup_{n \to \infty} \frac{1}{n} \log_2 A(n, \lfloor \delta n \rfloor),$$

also known as the asymptotic maximal rate of the code with relative distance $\delta$, for $0 \le \delta \le \frac{1}{2}$.

Next, we need the notion of a maximal eigenvalue of a subset of the cube. We say that two
elements x, y of $\{0,1\}^n$ are adjacent and write $x \sim y$ if the Hamming distance between x and
y is 1. Let A be the adjacency matrix of the obtained graph. For $B \subseteq \{0,1\}^n$, set

$$\lambda_B = \max \left\{ \frac{\langle Af, f \rangle}{\langle f, f \rangle} \; : \; f : \{0,1\}^n \to \mathbb{R}, \ \operatorname{supp}(f) \subseteq B \right\}.$$

In other words, $\lambda_B$ is the maximal eigenvalue of the adjacency matrix of the subgraph of $\{0,1\}^n$
induced by the vertices of B.
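
As a hedged illustration of this definition (it is not part of the original argument), the following Python sketch approximates $\lambda_B$ for a small set B by power iteration. The function names `hamming_ball` and `induced_max_eigenvalue`, the encoding of points of the cube as integer bitmasks, and the shift by the identity are all assumptions made for the example.

```python
def hamming_ball(n, r):
    """Points of {0,1}^n, encoded as bitmasks, of Hamming weight at most r."""
    return [x for x in range(1 << n) if bin(x).count("1") <= r]


def induced_max_eigenvalue(n, B, iters=2000):
    """Approximate the largest adjacency eigenvalue of the subgraph of the
    n-cube induced by B, using power iteration on A + I.

    The shift by I is an implementation choice: induced subgraphs of the
    cube are bipartite, so plain power iteration on A could oscillate.
    """
    pts = set(B)
    v = {x: 1.0 for x in pts}
    for _ in range(iters):
        # One step of v -> (A + I) v, restricted to B.
        w = {x: v[x] + sum(v.get(x ^ (1 << k), 0.0) for k in range(n)) for x in pts}
        norm = max(w.values())
        v = {x: w[x] / norm for x in pts}
    # Rayleigh quotient <Av, v> / <v, v> of the final iterate.
    num = sum(v[x] * sum(v.get(x ^ (1 << k), 0.0) for k in range(n)) for x in pts)
    den = sum(val * val for val in v.values())
    return num / den


print(induced_max_eigenvalue(4, hamming_ball(4, 1)))
```

For the ball of radius 1 in $\{0,1\}^4$ the induced graph is a star with four leaves, whose largest eigenvalue is 2; for the whole cube the value is n.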

Our main technical claim is 

Proposition 1.1: Let C be a code with block length n and minimal distance d. Let B be a
subset of $\{0,1\}^n$ with $\lambda_B \ge n - 2d + 1$. Then

$$|C| \le n|B|.$$

A linear code C is a linear subspace of $\mathbb{F}_2^n$. The dual code $C^\perp$ is the orthogonal subspace,
that is, it contains all the vectors orthogonal to C over $\mathbb{F}_2$. Proposition 1.1 has an appealing
geometric interpretation for linear codes.

Proposition 1.2: Let C be a linear code with block length n and minimal distance d. Let B
be a subset of $\{0,1\}^n$ with $\lambda_B \ge n - 2d + 1$. Then

$$\left| \bigcup_{z \in C^\perp} (z + B) \right| \ge \frac{2^n}{n}.$$

In other words, replacing every point in the dual code by a (shifted) copy of B, we will cover
a large fraction of the space $\{0,1\}^n$. Proposition 1.1 for linear codes is an immediate corollary
of this inequality: the union contains at most $|C^\perp| \cdot |B|$ points, and $|C| \cdot |C^\perp| = 2^n$.
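
A small numerical sanity check of Proposition 1.2 can be sketched as follows. This is only an illustration under stated assumptions: the choice of the [7,4,3] Hamming code, the variable names, and the bitmask encoding are ours, not the paper's. The set B is the Hamming ball of radius 1, whose induced graph is the star $K_{1,n}$ with largest eigenvalue $\sqrt{n}$.

```python
def popcount(x):
    return bin(x).count("1")


n = 7
# Rows of a parity-check matrix of the [7,4,3] Hamming code: column i is the
# binary expansion of i + 1 (an illustrative choice of code).
rows = [sum(1 << i for i in range(n) if ((i + 1) >> j) & 1) for j in range(3)]

# C: all vectors orthogonal to every row. C_dual: the span of the rows.
C = [x for x in range(1 << n) if all(popcount(x & h) % 2 == 0 for h in rows)]
C_dual = [0]
for h in rows:
    C_dual += [z ^ h for z in C_dual]

d = min(popcount(x) for x in C if x)          # minimal distance of the linear code C
B = [0] + [1 << k for k in range(n)]          # Hamming ball of radius 1
lambda_B = n ** 0.5                           # largest eigenvalue of the star K_{1,n}

covered = {z ^ b for z in C_dual for b in B}  # union of the shifts z + B

print("hypothesis lambda_B >= n - 2d + 1:", lambda_B >= n - 2 * d + 1)
print("covered points:", len(covered), "threshold 2^n / n:", 2 ** n / n)
print("|C| <= n|B|:", len(C) <= n * len(B))
```

In this tiny example the bound is far from tight; the point is only to make the objects in the statement (the dual code, the shifts $z + B$, and the eigenvalue condition) concrete.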

A code C has dual distance d if the Fourier transform of its characteristic function vanishes
on points S of Hamming weight $0 < |S| < d$. In particular, the dual distance of a linear code is
easily seen to equal the minimal distance of its dual (cf. discussion in Subsection 2.1). Hence,
the following claim generalizes Proposition 1.2.

Proposition 1.3: Let C be a code with block length n and dual distance d. Let B be a subset
of $\{0,1\}^n$ with $\lambda_B \ge n - 2d + 1$. Then

$$\left| \bigcup_{z \in C} (z + B) \right| \ge \frac{2^n}{n} \qquad (1)$$

Hamming balls are a good choice for the covering set B.

Lemma 1.4: Let B(r) be a Hamming ball of radius r. Then

$$\lambda_{B(r)} \ge 2\sqrt{r(n-r)} - o(n).$$
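
The lemma can be explored numerically with the sketch below, which is an illustration rather than the paper's method. It relies on the fact that the top eigenfunction of the ball can be taken symmetric, so that $\lambda_{B(r)}$ is the largest eigenvalue of the tridiagonal operator $g \mapsto i\,g(i-1) + (n-i)\,g(i+1)$ on weights $0, \ldots, r$ (the operator used in Subsection 2.4). The function name `ball_max_eigenvalue` and the use of Sturm-sequence bisection are assumptions of the example.

```python
import math


def ball_max_eigenvalue(n, r, iters=200):
    """Largest eigenvalue of g -> i*g(i-1) + (n-i)*g(i+1) on weights 0..r.

    The operator is similar to a symmetric tridiagonal matrix with zero
    diagonal and squared off-diagonal entries i*(n-i+1), i = 1..r, so its
    eigenvalues can be counted with a Sturm sequence and located by bisection.
    """
    def count_below(x):
        # Number of eigenvalues strictly below x.
        count, q = 0, -x
        if q < 0:
            count += 1
        for i in range(1, r + 1):
            if q == 0.0:
                q = 1e-300  # avoid division by zero
            q = -x - i * (n - i + 1) / q
            if q < 0:
                count += 1
        return count

    lo, hi = 0.0, n + 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if count_below(mid) == r + 1:   # every eigenvalue lies below mid
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


n, r = 400, 100
print(ball_max_eigenvalue(n, r), 2 * math.sqrt(r * (n - r)))
```

Printing the two numbers side by side for growing n (with r/n fixed) gives a feel for the size of the o(n) term; the sketch makes no claim about its exact rate.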

Proposition 1.3 together with Lemma 1.4 leads to a relation between the dual distance of a
code and its essential covering radius.

Corollary 1.5: Let C be a code with block length n and dual distance d. Then the essential
covering radius of C is at most

$$r \le \frac{n}{2} - \sqrt{d(n-d)} + o(n) \qquad (2)$$

In particular, let C be a linear code with block length n and minimal distance d. Then the
essential covering radius of the dual code $C^\perp$ is at most $r \le \frac{n}{2} - \sqrt{d(n-d)} + o(n)$.






Proposition 1.1 together with Lemma 1.4 leads to an upper bound on the size of a code
C with block length n and minimal distance d. Together they show that there exists a radius
$r \le \frac{n}{2} - \sqrt{d(n-d)} + o(n)$ such that

$$|C| \le n|B(r)| \qquad (3)$$

Corollary 1.5 gives a geometric explanation of this bound for a linear code C. The balls of radius
r centered at the points of the dual code $C^\perp$ cover a $(1/n)$-fraction of the space. Therefore
$|C^\perp| \cdot |B(r)| \ge 2^n/n$, and $|C| = 2^n/|C^\perp| \le n|B(r)|$. This allows us to view the bound (3) as a
covering bound.

For a general code the covering interpretation of (3) is more tenuous since, in particular,
there is no natural notion of the dual code. However, the analytic reasoning leading to (3) can
be viewed as a functional version of the covering argument above (see Subsection 2.3).

The cardinality of a Hamming ball of radius r is $2^{n(H(r/n) + o(1))}$ [12], where H is the binary
entropy function. Substituting the value $r = \frac{n}{2} - \sqrt{d(n-d)} + o(n)$ on the right hand side of (3), we have

$$|C| \le 2^{n \left( H\left( 1/2 - \sqrt{d/n \, (1 - d/n)} \right) + o(1) \right)}.$$

This bounds the asymptotic maximal rate of a code with relative distance $\delta$,

$$R(\delta) \le H\left( \frac{1}{2} - \sqrt{\delta(1-\delta)} \right).$$



This is the first linear programming bound for error-correcting codes. 
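
For concreteness, the bound can be evaluated numerically. The sketch below is only an illustration; the function names `binary_entropy` and `first_lp_bound` are ours, and the sample values of $\delta$ are arbitrary.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def first_lp_bound(delta):
    """The bound H(1/2 - sqrt(delta (1 - delta))) on R(delta), for 0 <= delta <= 1/2."""
    return binary_entropy(0.5 - math.sqrt(delta * (1 - delta)))


for delta in (0.1, 0.2, 0.3, 0.4):
    print(delta, round(first_lp_bound(delta), 4))
```

The bound equals 1 at $\delta = 0$, decreases as $\delta$ grows, and vanishes at $\delta = 1/2$.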

Finally, a code with dual distance d is a design of strength d [11] (or a (d - 1)-wise independent
set [1]). Proposition 1.3 together with Lemma 1.4 leads to the first linear programming
bound for designs [15].

Summing up 

Three notions of duality are relevant to this discussion. The first is linear programming 
duality as represented by the primal and dual linear programs of Delsarte. Recall that the 
primal linear program of Delsarte is a relaxation of the combinatorial question on cardinality of 
an optimal code. The linear programming bounds on codes are obtained by constructing good 
feasible solutions for the dual program. 

The second notion is the Fourier duality, illustrated by the Kalai-Linial approach to the 
problem. Viewing the characteristic function of a code as a real-valued function on the cube, 
and studying the properties of this function and its Fourier transform lead to an equivalent 
version of Delsarte's linear program. 

The third notion is the duality between packing and covering problems in hypergraphs [2]. 
The vertices of the pertinent hypergraph are the vertices of the cube and the edges are Hamming 
balls. The fractional packing and covering problems are dual linear programs. This induces a 
duality relation between their integer versions which are of interest here. Generally, covering is 
much easier than packing. For instance, the integrality gap for covering is logarithmic at worst [13],
while for packing it could be much larger [2]. In the context of coding theory, the asymptotics
of optimal packings are unknown, while the asymptotics of optimal coverings are easy to find.



The main observation of this paper is that, in our case, Fourier duality makes it possible to 
pass from a "hard" packing problem of finding the maximal cardinality of a code with a given
minimal distance to an "easy" covering question of determining the minimal cardinality of a 
code with a given covering radius. We suggest that this point of view might explain the power 
of the resulting bound, namely the first linear programming bound for error-correcting codes. 

We also point out that this bound is a natural isoperimetric constant of the Hamming cube,
related to its Faber-Krahn minima ([8], [18]; see the discussion below).

Related work 

1. Our research was motivated by a recent result of Friedman and Tillich [8]. Using methods
from algebraic graph theory the authors prove the first linear programming bound for
linear binary codes. In particular, Proposition 1.1 for linear codes and Lemma 1.4 are
proved in [8].¹ The appeal of [8] is in suggesting a way to work with Delsarte's linear
inequalities without resorting to the language and tools of orthogonal polynomial theory.

2. Combining the approach of Friedman and Tillich with the Fourier-analytic view of Delsarte's
linear program due to Kalai and Linial allowed us to extend this approach to
general binary codes, with, we believe, a simpler proof. After completing our work on the
conference version of this paper [16], we learned that Fourier analysis was used in a similar
manner by Cohn and Elkies [5] to give a simpler proof of Levenshtein's bound on sphere
packing in $\mathbb{R}^n$. In particular, [5] contains (somewhat implicitly) arguments analogous to
our proofs of Proposition 1.1 and Lemma 1.4.

3. The relation between the dual distance of a code and its covering radius has been extensively
investigated in the coding literature (see [3] and the references there). The best
known bounds are obtained via the linear programming approach and are somewhat weaker
than (2). This, of course, stands to reason, since the covering radius of a code is, in general,
larger than the essential covering radius. The best known upper bound on the covering
radius $r_c$ of a code with dual distance d is given in [19].

Better bounds are known for linear codes [3].

Extensions and ramifications 

1. The approach of this paper extends to general commutative association schemes [17].
Given an association scheme with k + 1 classes, it is possible to define a formal "Fourier
transform" on vectors in $\mathbb{R}^{k+1}$. This transform, contrary to a genuine Fourier transform, is
not self-dual. The solution is to define a pair of "inverse transforms". An appropriate pair
of linear transformations is given by the transition matrices of the scheme. It is possible
to define a convolution operation for each of these transforms, which is commutative and
associative, and which is taken by the transform to a point-wise multiplication. This
allows one to recover the best known bounds for codes and designs in commutative association
schemes via a Fourier-analytic proof similar to the proof of Proposition 1.1.

¹ We give a different proof of Lemma 1.4, based on an explicit construction of a function with a large Rayleigh
ratio.

2. Friedman and Tillich [8] ask what are the optimal covering subsets of the Hamming cube,
in the sense of Proposition 1.1. This question is answered in [18], by way of a modified
logarithmic Sobolev inequality for the Hamming cube. If B is a Hamming ball and X
is a subset of $\{0,1\}^n$ with $|X| = |B|$, then $\lambda_X \le (1 + o(1)) \cdot \lambda_B$. Hence Hamming balls are
asymptotically optimal. In the terminology of [8], Hamming balls are the Faber-Krahn
minimizers for the Hamming cube (up to a negligible error).

2 The proofs 

2.1 Fourier analysis on $\mathbb{F}_2^n$

We refer to [9] for background in Fourier analysis on $\mathbb{F}_2^n$. Here we list several necessary definitions
and simple facts.

$\mathbb{F}_2^n$ is a finite Abelian group, therefore its characters $\{W_S\}_{S \in \mathbb{F}_2^n}$ constitute a group (the dual
group, which is isomorphic to $\mathbb{F}_2^n$). The character $W_S$ is a function from $\mathbb{F}_2^n$ to $\{-1, 1\}$, defined
as $W_S(x) = (-1)^{\langle x, S \rangle}$. The characters $\{W_S\}_{S \in \mathbb{F}_2^n}$ form an orthonormal basis in the space of
real-valued functions on $\mathbb{F}_2^n$, equipped with the uniform probability distribution.

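Write $\mathbb{E} f$ for $\frac{1}{2^n} \sum_{x \in \mathbb{F}_2^n} f(x)$. For $f : \mathbb{F}_2^n \to \mathbb{R}$, define $\hat f : \mathbb{F}_2^n \to \mathbb{R}$ as
$\hat f(S) = \langle f, W_S \rangle = \mathbb{E}(f \cdot W_S)$. The function $\hat f$ is the Fourier transform of f. The Parseval identity states

$$\mathbb{E} fg = \langle f, g \rangle = \sum_S \hat f(S) \hat g(S).$$

For $f, g : \mathbb{F}_2^n \to \mathbb{R}$, the convolution of f and g is defined by $(f * g)(x) = \mathbb{E}_y f(y) g(x + y)$. The
convolution transforms to dot product: $\widehat{f * g} = \hat f \cdot \hat g$. The convolution operator is commutative
and associative.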

Finally, we need to know the Fourier transforms of some simple functions. The following facts
are easily verifiable. Let $f = 1_C$ be the characteristic function of a linear code C. Then
$\hat f = \frac{|C|}{2^n} 1_{C^\perp}$. Let $L(x) = 2^n$ if $|x| = 1$, and $L(x) = 0$ otherwise. Then $\hat L(S) = n - 2|S|$.

Note, for future use, that a code $C \subseteq \{0,1\}^n$ has minimal distance d if and only if
$(1_C * 1_C)(x) = 0$ for $0 < |x| < d$. For a linear code C this is equivalent to $1_C(x) = 0$ for
$0 < |x| < d$. Observe also that for a function f on the cube, $Af = f * L$.
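
The facts listed in this subsection are easy to confirm by brute force on a small cube. The following sketch is an illustration only; the helper names `fourier`, `convolve`, and `adjacency_apply`, the bitmask indexing, and the test function `f` are assumptions of the example. The normalizations follow the definitions above: $\hat f(S) = \mathbb{E}(f \cdot W_S)$ and $(f * g)(x) = \mathbb{E}_y f(y) g(x+y)$.

```python
def popcount(x):
    return bin(x).count("1")


def fourier(f, n):
    """fhat(S) = E_x f(x) (-1)^{<x, S>}, with f indexed by bitmasks 0..2^n - 1."""
    size = 1 << n
    return [sum(f[x] * (-1) ** popcount(x & s) for x in range(size)) / size
            for s in range(size)]


def convolve(f, g, n):
    """(f * g)(x) = E_y f(y) g(x + y), where + is bitwise XOR."""
    size = 1 << n
    return [sum(f[y] * g[x ^ y] for y in range(size)) / size for x in range(size)]


def adjacency_apply(f, n):
    """(Af)(x) = sum of f over the n neighbours of x in the cube."""
    return [sum(f[x ^ (1 << k)] for k in range(n)) for x in range(1 << n)]


n = 3
L = [float(1 << n) if popcount(x) == 1 else 0.0 for x in range(1 << n)]
f = [1.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0, 1.0]   # arbitrary test function

print(fourier(L, n))                                  # compare with n - 2|S|
print(convolve(f, L, n) == adjacency_apply(f, n))     # Af = f * L
```

The same helpers can be used to check $\widehat{f * g} = \hat f \cdot \hat g$ entry by entry.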

2.2 The proof of Proposition 1.3

We start with a simple observation that a function supported on a small set has a large ratio
between its second moment and the square of its first moment. Indeed, let a function F be
supported on a set U. Then, by the Cauchy-Schwarz inequality,

$$\mathbb{E}^2 F = \langle F, 1_U \rangle^2 \le \mathbb{E} F^2 \cdot \mathbb{E} (1_U)^2 = \frac{|U|}{2^n} \cdot \mathbb{E} F^2 \qquad (4)$$

Hence, to prove (1) it suffices to define a function F supported on $\bigcup_{z \in C} (z + B)$ with
$\frac{\mathbb{E} F^2}{\mathbb{E}^2 F} \le n$. Consider the adjacency matrix of the subgraph of $\{0,1\}^n$ induced by the vertices
of B. Let $f_B$ be an eigenfunction of this matrix corresponding to its maximal eigenvalue $\lambda_B$.
That is, $f_B$ is supported on B and $\lambda_B = \frac{\langle A f_B, f_B \rangle}{\langle f_B, f_B \rangle}$. Since the matrix A is nonnegative, so is the
function $f_B$, and we have $A f_B \ge \lambda_B f_B$. To see this, note that $A f_B = \lambda_B f_B$ on B and, since
$A f_B$ is nonnegative, the inequality holds outside B.

For typographic convenience we will write $\lambda = \lambda_B$ and $f = f_B$ from now on.

For a point z in the Hamming cube, let $f_z$ be a shifted version of f, taking $f_z(x) = f(x + z)$.
Define

$$F = 1_C * f = \frac{1}{2^n} \sum_{z \in C} f_z.$$

This is a nonnegative function supported on $\bigcup_{z \in C} (z + B)$. We estimate the inner product
$\langle AF, F \rangle$ in two ways.

On one hand,

$$AF = F * L = (1_C * f) * L = 1_C * (f * L) = 1_C * Af \ge \lambda (1_C * f) = \lambda F.$$

Therefore $\langle AF, F \rangle \ge \lambda_B \langle F, F \rangle = \lambda_B \mathbb{E} F^2$.

On the other hand, by Parseval's identity,

$$\langle AF, F \rangle = \sum_S \widehat{AF}(S) \hat F(S) = \sum_S \hat L(S) \hat F^2(S) = \sum_S (n - 2|S|) \hat F^2(S).$$

Now, $\hat F(S) = \hat{1_C}(S) \cdot \hat f(S) = 0$ for $0 < |S| < d$. Hence,

$$\sum_S (n - 2|S|) \hat F^2(S) = n \hat F^2(0) + \sum_{|S| \ge d} (n - 2|S|) \hat F^2(S) \le n \hat F^2(0) + (n - 2d) \sum_{|S| \ge d} \hat F^2(S) \le$$

$$n \hat F^2(0) + (n - 2d) \sum_S \hat F^2(S) = n \mathbb{E}^2 F + (n - 2d) \mathbb{E} F^2.$$

Combining the two estimates on $\langle AF, F \rangle$ and recalling $\lambda \ge n - 2d + 1$, we get

$$n \mathbb{E}^2 F \ge (\lambda - (n - 2d)) \mathbb{E} F^2 \ge \mathbb{E} F^2,$$

completing the proof. ∎

2.3 The proof of Proposition 1.1

The outline of the following proof is very similar to that of Proposition 1.2. We suggest that
it is worthwhile to view this proof as a functional version of the preceding proof. In particular,
in light of (4) and (5) below, it is useful to define the "essential support size" of a function g
as $2^n \cdot \frac{\mathbb{E}^2 g}{\mathbb{E} g^2}$.

Let $\phi$ be a function on the Hamming cube such that $\hat\phi^2 = 1_C * 1_C$. In other words,
$\widehat{\phi * \phi} = 1_C * 1_C$. Since the Fourier transform on the cube is an involution, up to normalization,
we have $\phi * \phi = 2^n \, \widehat{1_C * 1_C} = 2^n \, \hat{1_C}^2$. What is important is that

$$\phi * \phi \ge 0 \quad \text{and} \quad \frac{\mathbb{E} \phi^2}{\mathbb{E}^2 \phi} = \frac{(\phi * \phi)(0)}{\widehat{\phi * \phi}(0)} = |C|.$$

That is, the essential support size of $\phi$ is $\frac{2^n}{|C|}$. Note that, for a linear code C, we can choose $\phi$
to be (a multiple of) $1_{C^\perp}$.

Take $F = \phi * f$. We will show that $\mathbb{E} F^2 \le n \mathbb{E}^2 F$. It will take an easy additional step to
deduce the desired inequality $|C| \le n|B|$.

As before, we estimate the inner product $\langle AF, F \rangle$ in two ways. On one hand,

$$\langle AF, F \rangle = \langle (\phi * f) * L, \phi * f \rangle = \langle \phi * \phi * f, f * L \rangle = \langle \phi * \phi * f, Af \rangle \ge$$

$$\lambda \langle \phi * \phi * f, f \rangle = \lambda \langle \phi * f, \phi * f \rangle = \lambda \langle F, F \rangle = \lambda \mathbb{E} F^2.$$

On the other hand, $\langle AF, F \rangle \le n \mathbb{E}^2 F + (n - 2d) \mathbb{E} F^2$. The proof of this fact is exactly the same
as in the proof of Proposition 1.2 and we omit it.

Combining the two estimates and the assumption $\lambda \ge n - 2d + 1$ implies $\mathbb{E} F^2 \le n \mathbb{E}^2 F$.

Now, $\mathbb{E}^2 F = \mathbb{E}^2 (\phi * f) = \mathbb{E}^2 \phi \cdot \mathbb{E}^2 f$. On the other hand,

$$\mathbb{E} F^2 = \langle F, F \rangle = \langle \phi * f, \phi * f \rangle = \langle \phi * \phi, f * f \rangle \ge \frac{1}{2^n} (\phi * \phi)(0) \, (f * f)(0) = \frac{1}{2^n} \mathbb{E} \phi^2 \, \mathbb{E} f^2.$$

The inequality follows from nonnegativity of $\phi * \phi$. Since f is supported on B, the calculation
in (4) implies

$$|B| \ge 2^n \cdot \frac{\mathbb{E}^2 f}{\mathbb{E} f^2} \ge \frac{1}{n} \cdot \frac{\mathbb{E} \phi^2}{\mathbb{E}^2 \phi} = \frac{1}{n} |C|, \qquad (5)$$

completing the proof. ∎



2.4 The proof of Lemma 1.4

We prove the lemma by constructing an explicit function f supported on B = B(r) with

$$\frac{\langle Af, f \rangle}{\langle f, f \rangle} \ge \lambda = 2\sqrt{r(n-r)} - o(n).$$

In fact, we will guarantee more, namely $f \ge 0$ and $Af \ge \lambda f$.

The Hamming ball B contains all the points x of the Hamming cube with Hamming weight
$|x| \le r$. The function f will be symmetric, namely its value at a point will depend only on
the Hamming weight of the point. Such a function, of course, is fully defined by its values
$f(0), \ldots, f(n)$ at Hamming weights $0, \ldots, n$.

For a symmetric function g on $\{0,1\}^n$, $Ag(i) = i g(i-1) + (n-i) g(i+1)$. We start
with a preliminary construction of a symmetric function g, setting $g(0) = 1$ and defining $g(i)$
for $1 \le i \le n$ so that the relation

$$\lambda g(i) = i g(i-1) + (n-i) g(i+1) \qquad (6)$$

is satisfied for $i = 0, \ldots, n-1$. We will show below that there exists an integer $p \le r$ and a real
number $\lambda$ such that the function g is nonnegative on the integers $i = 0, \ldots, p$ and nonpositive
on $p + 1$, and that $\lambda \ge 2\sqrt{r(n-r)} - o(n)$.

Given this, we define $f = g$ for $i = 0, \ldots, p$ and $f = 0$ otherwise. Clearly, f is nonnegative
and supported on B. We claim $Af \ge \lambda f$. Indeed, by definition, $Af(i) = \lambda f(i)$ for $i \le p - 1$ and
for $i > p + 1$. It remains to check the two boundary values. For $i = p + 1$, $Af(i) \ge 0 = \lambda f(i)$.

For $i = p$,

$$Af(i) = p f(p-1) + (n-p) f(p+1) = p f(p-1) = p g(p-1) \ge$$

$$p g(p-1) + (n-p) g(p+1) = \lambda g(p) = \lambda f(p).$$

The inequality holds since $g(p+1) \le 0$.
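
The construction above can be traced numerically. The sketch below is an illustration under our own naming (`build_g`, `truncate`, `apply_A` are hypothetical helpers, and the values n = 20, λ = 10 are arbitrary): it generates g from the recurrence (6), cuts it at the first index p with $g(p+1) \le 0$, and checks $Af \ge \lambda f$ on every weight class.

```python
def build_g(n, lam):
    """g(0) = 1 and lam*g(i) = i*g(i-1) + (n-i)*g(i+1) for i = 0..n-1."""
    g = [1.0]
    for i in range(n):
        prev = g[i - 1] if i > 0 else 0.0
        g.append((lam * g[i] - i * prev) / (n - i))
    return g


def truncate(g):
    """Return (p, f): p is the first index with g(p+1) <= 0 (or n if none),
    and f equals g on 0..p and 0 elsewhere."""
    n = len(g) - 1
    p = next((i for i in range(n) if g[i + 1] <= 0.0), n)
    return p, g[: p + 1] + [0.0] * (n - p)


def apply_A(f):
    """(Af)(i) = i*f(i-1) + (n-i)*f(i+1) for a symmetric function f."""
    n = len(f) - 1
    return [(i * f[i - 1] if i > 0 else 0.0) + ((n - i) * f[i + 1] if i < n else 0.0)
            for i in range(n + 1)]


n, lam = 20, 10.0
p, f = truncate(build_g(n, lam))
ok = all(a >= lam * b - 1e-9 for a, b in zip(apply_A(f), f))
print("p =", p, " Af >= lam*f on all weights:", ok)
```

The small tolerance in the comparison only absorbs floating-point rounding on the weights where $Af = \lambda f$ holds with equality.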

It remains to show $\lambda \ge 2\sqrt{r(n-r)} - o(n)$. We will show that there is a function
$r(\lambda) = (1 + o(1)) \cdot \frac{n - \sqrt{n^2 - \lambda^2}}{2}$ such that $g = g_\lambda$ is negative at an integer point $p \le r(\lambda)$. Writing $\lambda$ as a
function of r gives the relation we need, that is $\lambda \ge 2\sqrt{r(n-r)} - o(n)$.

Fix $\epsilon > 0$. Let $t = \frac{n - \sqrt{n^2 - \lambda^2}}{2}$. We will assume that g is positive on the interval $[0, (1+\epsilon)t]$
and obtain a contradiction, for a sufficiently large n.

By the definition of g,

$$g(i+1) = \frac{\lambda g(i) - i g(i-1)}{n - i}.$$

Set $\theta(i) = \frac{g(i)}{g(i-1)}$. Since g is positive on $[0, (1+\epsilon)t]$, for any i in $[t, (1+\epsilon)t]$ holds $\theta(i+1) > 0$.
On the other hand, we claim that for any $i \ge (1 + \epsilon/2)t$ holds $\theta(i+1) \le \theta(i) \cdot (1 - \delta)$,
for a positive constant $\delta$ depending on $\epsilon$. These two facts evidently cannot coexist, giving the
desired contradiction.

Indeed, for any $i > 0$ holds $\theta(i+1) = \frac{\lambda}{n-i} - \frac{i}{(n-i)\theta(i)}$. We claim that for $i \ge (1 + \epsilon/2)t$, and
for any $x > 0$ holds

$$\frac{\lambda}{n-i} - \frac{i}{(n-i)x} \le (1 - \delta)x,$$

for some $\delta = \delta(\epsilon) > 0$. In fact, the discriminant of this quadratic inequality in x is easily seen
to be negative, for a sufficiently small $\delta$. ∎